Acoustics-based baseform generation with pronunciation and/or phonotactic models

Authors

Bhuvana Ramabhadran

Sabine Deligne

Abraham Ittycheriah

Abstract

In this paper, we describe a method for deriving the phonetic pronunciation of a word from an acoustic utterance of that word alone, without a priori knowledge of its spelling. In [5] and [6], we used a pronunciation model based on bigram statistics. Bigram statistics constrain each phone only by its left neighbor and therefore yield phone sequences that are only pairwise appropriate. Here, we apply a pronunciation model in combination with a phonotactic model, which serves as a language model to constrain the phone sequences produced. Error rates with and without the phonotactic model are presented.
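To make the idea of a bigram constraint over phones concrete, here is a minimal sketch in Python. It is not the authors' model: the add-one smoothing, the function names `train_bigram` and `log_prob`, and the toy training pronunciations are all assumptions chosen for illustration. It shows how a bigram model scores a candidate phone sequence using only each phone's left neighbor.

```python
import math
from collections import defaultdict


def train_bigram(pronunciations):
    """Count phone bigrams, using <s> and </s> as boundary markers."""
    counts = defaultdict(lambda: defaultdict(int))
    vocab = {"</s>"}  # symbols that can appear as the right-hand phone
    for phones in pronunciations:
        seq = ["<s>"] + list(phones) + ["</s>"]
        vocab.update(phones)
        for left, right in zip(seq, seq[1:]):
            counts[left][right] += 1
    return counts, vocab


def log_prob(phones, counts, vocab):
    """Add-one smoothed bigram log-probability of a phone sequence.

    Each phone is conditioned only on its left neighbor, so the score
    checks pairwise plausibility and nothing longer-range.
    """
    seq = ["<s>"] + list(phones) + ["</s>"]
    total = 0.0
    for left, right in zip(seq, seq[1:]):
        row = counts.get(left, {})
        numerator = row.get(right, 0) + 1
        denominator = sum(row.values()) + len(vocab)
        total += math.log(numerator / denominator)
    return total


# Hypothetical toy data (ARPAbet-like symbols), not taken from the paper.
TRAIN = [
    ["K", "AE", "T"],
    ["K", "AA", "T"],
    ["B", "AE", "T"],
    ["T", "AE", "K"],
]
COUNTS, VOCAB = train_bigram(TRAIN)

# A sequence built from bigrams seen in training versus one built from unseen pairs.
plausible = log_prob(["B", "AE", "K"], COUNTS, VOCAB)
implausible = log_prob(["T", "K", "B"], COUNTS, VOCAB)
print(plausible > implausible)
```

In a decoder, a score of this kind would typically be combined with the acoustic score of each candidate phone sequence; how the paper weights or combines the two models is not stated in the abstract.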

Similar resources

Improved Pronunciation Modeling by Properly Integrating Better Approaches for Baseform Generation, Ranking and Pruning

In this paper, a complete framework for pronunciation modeling process is discussed and analyzed as the integration of three individual but mutual-interactive stages, i.e., the baseform generation, baseform ranking, and baseform pruning stage. The characteristics of different techniques used in each stage and the interaction among them are then well reflected on the overall performance of pronu...

On the Adequacy of Baseform Pronunciations and Pronunciation Variants

This paper presents an approach to automatically extract and evaluate the “stability” of pronunciation variants (i.e., adequacy of the model to accommodate this variability), based on multiple pronunciations of each lexicon words and the knowledge of a reference baseform pronunciation. Most approaches toward modelling pronunciation variability in speech recognition are based on the inference (t...

Modeling Cantonese pronunciation variation by acoustic model refinement

Pronunciation variations can be roughly classified into two types: a phone change or a sound change [1][2]. A phone change happens when a canonical phone is produced as a different phone. Such a change can be modeled by converting the baseform (standard) phone to a surfaceform (actual) phone. A sound change happens at a lower, phonetic or subphonetic level within a phone and it cannot be modele...

Confidence Measures for Evaluating Pronunciation Models

In this paper, we investigate the use of confidence measures for the evaluation of pronunciation models and the employment of these evaluations in an automatic baseform learning process. The confidence measures and pronunciation models are obtained from the ABBOT hybrid Hidden Markov Model/Artificial Neural Network (HMM/ANN) Large Vocabulary Continuous Speech Recognition (LVCSR) system [8]. Exp...

Pronunciation ambiguity vs. pronunciation variability in speech recognition

It is widely acknowledged that pronunciations in spontaneous speech differ significantly from citation form. For this reason, pronunciation modeling has received considerable attention in recent automatic speech recognition literature. Most of the attention however has focussed on describing an alternate pronunciation as a different sequence of phonetic units using the same inventory of phones whi...

Publication date: 1999
